DESCRIPTION



MATERIALS AND METHODS FOR 
INCREASING CORN SEED WEIGHT 


This invention was made with government support under National Science Foundation grant
number 93052818. The government has certain rights in this invention.

Cross-Reference to a Related Application 
This application is a continuation-in-part of co-pending application Serial No. 08/299,675,
filed September 1, 1994.

Background of the Invention 
ADP-glucose pyrophosphorylase (AGP) catalyzes the conversion of ATP and α-glucose-1-phosphate
to ADP-glucose and pyrophosphate. ADP-glucose is used as a glycosyl donor in starch
biosynthesis by plants and in glycogen biosynthesis by bacteria. The importance of ADP-glucose
pyrophosphorylase as a key enzyme in the regulation of starch biosynthesis was noted in the study
of starch deficient mutants of maize (Zea mays) endosperm (Tsai and Nelson, 1966; Dickinson and
Preiss, 1969). AGP enzymes have been isolated from both bacteria and plants. Bacterial AGP
consists of a homotetramer, while plant AGP from photosynthetic and non-photosynthetic tissues
is a heterotetramer composed of two different subunits. The plant enzyme is encoded by two
different genes, with one subunit being larger than the other. This feature has been noted in a
number of plants. The AGP subunits in spinach leaf have molecular weights of 54 kDa and 51 kDa,
as estimated by SDS-PAGE. Both subunits are immunoreactive with antibody raised against
purified AGP from spinach leaves (Copeland and Preiss, 1981; Morell et al., 1987). Immunological
analysis using antiserum prepared against the small and large subunits of spinach leaf showed that
potato tuber AGP is also encoded by two genes (Okita et al., 1990). The cDNA clones of the two
subunits of potato tuber (50 and 51 kDa) have also been isolated and sequenced (Muller-Rober et
al., 1990; Nakata et al., 1991).

As Hannah and Nelson (Hannah and Nelson, 1975 and 1976) postulated, both Shrunken-2
(Sh2) (Bhave et al., 1990) and Brittle-2 (Bt2) (Bae et al., 1990) are structural genes of maize
endosperm ADP-glucose pyrophosphorylase. Sh2 and Bt2 encode the large subunit and small
subunit of the enzyme, respectively. From cDNA sequencing, Sh2 and Bt2 proteins have predicted
molecular weights of 57,179 Da (Shaw and Hannah, 1992) and 52,224 Da, respectively. The
endosperm is the site of most starch deposition during kernel development in maize. Sh2 and bt2
maize endosperm mutants have greatly reduced starch levels corresponding to deficient levels of
AGP activity. Mutations of either gene have been shown to reduce AGP activity by about 95% (Tsai
and Nelson, 1966; Dickinson and Preiss, 1969). Furthermore, it has been observed that enzymatic
activities increase with the dosage of functional wild type Sh2 and Bt2 alleles, whereas mutant
enzymes have altered kinetic properties. AGP is the rate limiting step in starch biosynthesis in
plants. Stark et al. placed a mutant form of E. coli AGP in potato tuber and obtained a 35% increase
in starch content (Stark, 1992).

The cloning and characterization of the genes encoding the AGP enzyme subunits have been
reported for various plants. These include Sh2 cDNA (Bhave et al., 1990), Sh2 genomic DNA
(Shaw and Hannah, 1992), and Bt2 cDNA (Bae et al., 1990) from maize; small subunit cDNA
(Anderson et al., 1989) and genomic DNA (Anderson et al., 1991) from rice; and small and large
subunit cDNAs from spinach leaf (Morell et al., 1987) and potato tuber (Muller-Rober et al., 1990;
Nakata et al., 1991). In addition, cDNA clones have been isolated from wheat endosperm and leaf
tissue (Olive et al., 1989) and Arabidopsis thaliana leaf (Lin et al., 1988).

AGP functions as an allosteric enzyme in all tissues and organisms investigated to date. The
allosteric properties of AGP were first shown to be important in E. coli. A glycogen-overproducing
E. coli mutant was isolated and the mutation mapped to the structural gene for AGP, designated as
glgC. The mutant E. coli, known as glgC-16, was shown to be more sensitive to the activator,
fructose 1,6 bisphosphate, and less sensitive to the inhibitor, cAMP (Preiss, 1984). Although plant
AGPs are also allosteric, they respond to different effector molecules than bacterial AGPs. In
plants, 3-phosphoglyceric acid (3-PGA) functions as an activator while phosphate (PO4) serves as
an inhibitor (Dickinson and Preiss, 1969).

In view of the fact that endosperm starch content comprises approximately 70% of the dry 
weight of the seed, alterations in starch biosynthesis correlate with seed weight. Unfortunately, the
undesirable effect associated with such alterations has been an increase in the relative starch content 
of the seed. Therefore, the development of a method for increasing seed weight in plants without 
increasing the relative starch content of the seed is an object of the subject invention. 

Brief Summary of the Invention

The subject invention concerns a novel variant of the Shrunken-2 (Sh2) gene from maize. 
The Sh2 gene encodes ADP-glucose pyrophosphorylase (AGP), an important enzyme involved in
starch synthesis in the major part of the corn seed, the endosperm. In a preferred embodiment, the
novel gene of the subject invention encodes a variant AGP protein which has two additional amino
acids inserted into the sequence. The variant gene described herein has been termed the Sh2-m1Rev6
gene. Surprisingly, the presence of the Sh2-m1Rev6 gene in a corn plant results in a substantial
increase in corn seed weight when compared to wild type seed weight, but does so in the absence of
an increase in the relative starch content of the kernel.

The subject invention further concerns a method of using the variant sh2 gene in maize to
increase seed weight. The subject invention also concerns plants having the variant sh2 gene and
expressing the mutant protein in the seed endosperm.

As described herein, the sh2 variant, Sh2-m1Rev6, can be produced using in vivo,
site-specific mutagenesis. A transposable element was used to create a series of mutations in the
sequence of the gene that encodes the enzyme. As a result, the Sh2-m1Rev6 gene encodes an
additional amino acid pair within or close to the allosteric binding site of the protein.

Brief Description of the Sequences 
SEQ ID NO. 1 is the genomic nucleotide sequence of the Sh2-m1Rev6 gene.

SEQ ID NO. 2 is the nucleotide sequence of the Sh2-m1Rev6 cDNA.

SEQ ID NO. 3 is the amino acid sequence of the protein encoded by nucleotides 87 through
1640 of SEQ ID NO. 2.

SEQ ID NO. 4 is a nucleotide sequence encoding the amino acid sequence shown in SEQ
ID NO. 5.

SEQ ID NO. 5 is the amino acid sequence of an ADP-glucose pyrophosphorylase (AGP)
enzyme subunit containing a single serine insertion.
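
As a quick consistency check on the sequence descriptions above, the coding span stated for SEQ ID NO. 2 can be converted into a codon count and compared with the length listed for SEQ ID NO. 3. The following is a minimal sketch only; the helper name codon_count is a hypothetical name introduced here for illustration, and the check assumes that the stated span does not include the stop codon.

```python
def codon_count(first_nt: int, last_nt: int) -> int:
    """Return the number of whole codons in an inclusive, 1-based nucleotide span."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# Coding span stated for SEQ ID NO. 3 within SEQ ID NO. 2 (nucleotides 87 through 1640).
rev6_codons = codon_count(87, 1640)
print(rev6_codons)  # 518, the residue count listed for SEQ ID NO. 3

# SEQ ID NO. 4 is listed in the sequence listing as 1551 base pairs.
serine_variant_codons = codon_count(1, 1551)
print(serine_variant_codons)  # 517
```

Under these assumptions, 518 codons for the two-residue insertion and 517 codons for the single serine insertion are both consistent with a native subunit of 516 residues.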

Detailed Disclosure of the Invention 
The subject invention provides novel variants of the Shrunken-2 (Sh2) gene and a method
for increasing seed weight in a plant through the expression of the variant sh2 gene. The Sh2 gene
encodes a subunit of the enzyme ADP-glucose pyrophosphorylase (AGP) in maize endosperm. One
variant gene, denoted herein as Sh2-m1Rev6, contains an insertion mutation that encodes an
additional tyrosine:serine or serine:tyrosine amino acid pair that is not present in the wild type
protein. The sequences of the wild type DNA and protein are disclosed in Shaw and Hannah, 1992.
The in vivo, site-specific mutation which resulted in the tyrosine:serine or serine:tyrosine insertion
was generated in Sh2 using the transposable element, Dissociation (Ds), which can insert into, and
be excised from, the Sh2 gene under appropriate conditions. Ds excision can alter gene expression
through the addition of nucleotides to a gene at the site of excision of the element.

In a preferred embodiment, insertion mutations in the Sh2 gene were obtained by screening
for germinal revertants after excision of the Ds transposon from the gene. The revertants were
generated by self-pollination of a stock containing the Ds-Sh2 mutant allele, the Activator (Ac)
element of this transposable element system, and appropriate outside markers. The Ds element can
transpose when the Ac element is present. Wild type seed were selected, planted, self-pollinated and
crossed onto a tester stock. Results from this test cross were used to remove wild type alleles due
to pollen contamination. Seeds homozygous for each revertant allele were obtained from the
self-progeny. Forty-four germinal revertants of the Ds-induced sh2 mutant were collected.

Cloning and sequencing of the Ds insertion site showed that the nucleotide insertion resides
in the area of the gene that encodes the binding site for the AGP activator, 3-PGA (Morrell, 1988).
Of the 44 germinal revertants obtained, 28 were sequenced. The sequenced revertants defined 5
isoalleles of sh2: 13 restored the wild type sequence, 11 resulted in the insertion of the amino acid
tyrosine, two contained an additional serine (inserted between amino acid residues 494 and 495
of the native protein sequence), one revertant contained a two amino acid insertion,
tyrosine:tyrosine, and the last one, designated as Sh2-m1Rev6, contained the two amino acid
insertion, tyrosine:serine or serine:tyrosine. The Sh2-m1Rev6 variant encodes an AGP enzyme
subunit that has either the tyrosine:serine amino acid pair inserted between the glycine and tyrosine
at amino acid residues 494 and 495, respectively, of the native protein, or the serine:tyrosine amino
acid pair inserted between the two tyrosine residues located at positions 495 and 496 of the native
protein sequence. Due to the sequence of the amino acids in the area of the insertions, the
Sh2-m1Rev6 variant amino acid sequences encoded by each of these insertions are identical to each other.
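
The statement that both descriptions of the two-residue insertion yield the same protein can be checked on a short fragment. The sketch below is illustrative only: the native fragment (residues 492 through 498, in one-letter code) is an assumption inferred by removing the inserted pair from SEQ ID NO. 3, and the helper insert_after is a hypothetical name introduced here.

```python
# Native residues 492-498 (Glu Glu Gly Tyr Tyr Ile Arg), inferred from SEQ ID NO. 3
# with the inserted pair removed. This fragment is an assumption for illustration.
NATIVE_FRAGMENT = "EEGYYIR"
FRAGMENT_START = 492  # native position of the first residue in the fragment


def insert_after(fragment: str, start: int, position: int, residues: str) -> str:
    """Insert `residues` immediately after the given native position."""
    index = position - start + 1
    return fragment[:index] + residues + fragment[index:]


# Tyr:Ser inserted between Gly494 and Tyr495.
tyr_ser_after_494 = insert_after(NATIVE_FRAGMENT, FRAGMENT_START, 494, "YS")
# Ser:Tyr inserted between Tyr495 and Tyr496.
ser_tyr_after_495 = insert_after(NATIVE_FRAGMENT, FRAGMENT_START, 495, "SY")
# For contrast: Ser:Tyr inserted between Gly494 and Tyr495 gives a different sequence.
ser_tyr_after_494 = insert_after(NATIVE_FRAGMENT, FRAGMENT_START, 494, "SY")

print(tyr_ser_after_494)  # EEGYSYYIR
print(ser_tyr_after_495)  # EEGYSYYIR
print(ser_tyr_after_494)  # EEGSYYYIR
```

Both equivalent insertions reproduce the Gly-Tyr-Ser-Tyr-Tyr run found at positions 494 through 498 of SEQ ID NO. 3.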

Surprisingly, the expression of the Sh2-m1Rev6 gene in maize resulted in a significant
increase in seed weight over that obtained from maize expressing the wild-type Sh2 allele.
Moreover, seeds from plants having the Sh2-m1Rev6 gene contained approximately the same
percentage starch content relative to any of the other revertants generated. In a preferred
embodiment, the Sh2-m1Rev6 gene is contained in homozygous form within the genome of a plant
seed.

The subject invention further concerns a plant that has the Sh2-m1Rev6 gene incorporated
into its genome. Other alleles disclosed herein can also be incorporated into a plant genome. In a
preferred embodiment, the plant is a monocotyledonous species. More preferably, the plant may be
Zea mays. Plants having the Sh2-m1Rev6 gene can be grown from seeds that have the gene in their
genome. In addition, techniques for transforming plants with a gene are known in the art.

Because of the degeneracy of the genetic code, a variety of different polynucleotide
sequences can encode the variant AGP polypeptide disclosed herein. In addition, it is well within
the skill of a person trained in the art to create alternative polynucleotide sequences encoding the
same, or essentially the same, polypeptide of the subject invention. These variant or alternative
polynucleotide sequences are within the scope of the subject invention. As used herein, references
to "essentially the same" sequence refer to sequences which encode amino acid substitutions,
deletions, additions, or insertions which do not materially alter the functional activity of the
polypeptide encoded by Sh2-m1Rev6 or the other alleles. The subject invention also contemplates
those polynucleotide molecules having sequences which are sufficiently homologous with the wild
type Sh2 DNA sequence so as to permit hybridization with that sequence under standard
high-stringency conditions. Such hybridization conditions are conventional in the art (see, e.g., Maniatis
et al., 1989).
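
To give a sense of scale for the degeneracy mentioned above, the number of distinct coding sequences for even a short peptide can be counted from the number of codons available for each residue in the standard genetic code. The sketch below is illustrative only; the function name count_encodings is a hypothetical name introduced here, and the five-residue peptide is the Gly-Tyr-Ser-Tyr-Tyr run at positions 494 through 498 of SEQ ID NO. 3.

```python
from math import prod

# Codons per amino acid in the standard genetic code
# (only the residues needed for this illustration, in one-letter code).
CODONS_PER_RESIDUE = {"G": 4, "Y": 2, "S": 6}


def count_encodings(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


# Gly-Tyr-Ser-Tyr-Tyr: 4 * 2 * 6 * 2 * 2 possible coding sequences.
insertion_region_encodings = count_encodings("GYSYY")
print(insertion_region_encodings)  # 192
```

The count grows multiplicatively with peptide length, which is why a full-length subunit can be encoded by a very large number of alternative polynucleotide sequences.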

The polynucleotide molecules of the subject invention can be used to transform plants to
express the Sh2-m1Rev6 allele, or other alleles of the subject invention, in those plants. In addition,
the polynucleotides of the subject invention can be used to express the recombinant variant AGP
enzyme. They can also be used as a probe to detect related enzymes. The polynucleotides can also
be used as DNA sizing standards.

The polypeptides encoded by the polynucleotides of the subject invention can be used to 
catalyze the conversion of ATP and α-glucose-1-phosphate to ADP-glucose and pyrophosphate, or
to raise an immunogenic response to the AGP enzymes and variants thereof. They can also be used 
as molecular weight standards, or as an inert protein in an assay. 


The following are examples which illustrate procedures and processes, including the best 
mode, for practicing the invention. These examples should not be construed as limiting, and are not 
intended to be a delineation of all possible modifications to the technique. All percentages are by 
weight and all solvent mixture proportions are by volume unless otherwise noted. 


Example 1 - Expression of Sh2-m1Rev6 Gene in Maize Endosperm

Homozygous plants of each revertant obtained after excision of the Ds transposon were
crossed onto the F1 hybrid corn, "Florida Stay Sweet." This sweet corn contains a null allele for the
Sh2 gene, termed sh2-R. Resulting endosperms contained one dose of the functional allele from a
revertant and two female-derived null alleles, denoted by the following genotype:
Sh2-m1RevX/sh2-R/sh2-R, where X represents one of the various isoalleles of the revertants. Crosses were made
during two growing seasons.
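
The endosperm genotype notation used in this example follows from the triploid endosperm carrying one pollen-derived dose and two female-derived doses. The sketch below simply assembles that notation; the function names endosperm_genotype and functional_doses are hypothetical names introduced here for illustration and are not part of the described procedure.

```python
def endosperm_genotype(pollen_allele: str, female_allele: str) -> str:
    """Triploid endosperm: one pollen-derived dose and two female-derived doses."""
    return "/".join([pollen_allele, female_allele, female_allele])


def functional_doses(genotype: str, null_allele: str = "sh2-R") -> int:
    """Count the doses in the genotype that are not the null allele."""
    return sum(1 for allele in genotype.split("/") if allele != null_allele)


# Revertant pollen crossed onto a female parent homozygous for the sh2-R null allele.
genotype = endosperm_genotype("Sh2-m1Rev6", "sh2-R")
print(genotype)                    # Sh2-m1Rev6/sh2-R/sh2-R
print(functional_doses(genotype))  # 1
```

Because every revertant is tested at the same single dose against the same null background, differences in seed weight can be compared across isoalleles.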

Resulting seed weight data for each revertant and wild type seed are shown in Table 1. The
first column shows the amino acid insertion in the AGP enzyme obtained after the in vivo,
site-specific mutagenesis.



Table 1.

Sequence alteration    # of revertants    Average seed weight    Standard deviation
wild type              13                 0.250 grams            0.015
tyrosine               11                 0.238 grams            0.025
serine                 2                  0.261 grams            0.014
tyr, tyr               1                  0.223 grams            nd*
tyr, ser               1                  0.289 grams            0.022

*nd = not determined


The data shown in Table 1 represent the average kernel seed weight for each revertant over
the course of two growing seasons. The expression of the Sh2-m1Rev6 gene to produce the Rev6
mutant AGP subunit gave rise to an almost 16% increase in seed weight in comparison to the wild
type revertant. The revertants having the single serine insertion also showed an increase in average
seed weight over wild type seed weight.
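
The percentage figures quoted above can be reproduced directly from the averages in Table 1. The following is a minimal sketch using only the tabulated values; the names TABLE_1 and percent_change are hypothetical names introduced here for illustration.

```python
# (number of revertants, average seed weight in grams), copied from Table 1.
TABLE_1 = {
    "wild type": (13, 0.250),
    "tyrosine": (11, 0.238),
    "serine": (2, 0.261),
    "tyr, tyr": (1, 0.223),
    "tyr, ser": (1, 0.289),
}


def percent_change(weight: float, reference: float) -> float:
    """Percentage change of `weight` relative to `reference`."""
    return (weight - reference) / reference * 100.0


wild_type_weight = TABLE_1["wild type"][1]
changes = {
    name: round(percent_change(weight, wild_type_weight), 1)
    for name, (_, weight) in TABLE_1.items()
    if name != "wild type"
}
total_sequenced = sum(count for count, _ in TABLE_1.values())

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
print(total_sequenced)
```

The tyr, ser (Rev6) entry works out to 15.6%, which is the "almost 16%" increase cited, and the revertant counts sum to the 28 sequenced revertants.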

In addition, starch content was determined on the kernels analyzed above using various
methodologies. The analysis showed that Sh2-m1Rev6 containing kernels were no higher in
percentage starch relative to kernels expressing the other alleles shown in the table above. Therefore,
it appears that the increase in seed weight is not solely a function of starch content.

Corn seeds that contain at least one functional Sh2-m1Rev6 allele (the tyrosine, serine
insertion) have been deposited with the American Type Culture Collection (ATCC), 12301 Parklawn
Drive, Rockville, Maryland 20852 USA, on May 20, 1996 and assigned ATCC accession number
ATCC 97624. Seeds having at least one functional Sh2-m1Rev20 allele (serine insertion) have also
been deposited with ATCC on May 20, 1996 and assigned ATCC accession number ATCC 97625.
The seeds have been deposited under conditions that assure that access to the biological
material will be available during the pendency of this patent application to one determined by the
Commissioner of Patents and Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C.
122. The deposit will be available as required by foreign patent laws in countries wherein
counterparts of the subject application, or its progeny, are filed. However, it should be understood
that the availability of a deposit does not constitute a license to practice the subject invention in
derogation of patent rights granted by governmental action.

Further, the subject seed deposit will be stored and made available to the public in accord
with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., it will be stored
with all the care necessary to keep it viable and uncontaminated for a period of at least five years
after the most recent request for the furnishing of a sample of the deposit, and in any case, for a
period of at least thirty (30) years after the date of deposit or for the enforceable life of any patent
which may issue disclosing the seed. The depositor acknowledges the duty to replace the deposit
should the depository be unable to furnish a sample when requested, due to the condition of the
deposit. All restrictions on the availability to the public of the subject seed deposit will be
irrevocably removed upon the granting of a patent disclosing it.

As would be apparent to a person of ordinary skill in the art, seeds and plants that are
homozygous for the Sh2-m1Rev6 or the Sh2-m1Rev20 allele can be readily prepared from
heterozygous seeds using techniques that are standard in the art. In addition, the Sh2-m1Rev6 and
Sh2-m1Rev20 genes can be readily obtained from the deposited seeds.

The skilled artisan, using standard techniques known in the art, can also prepare 
polynucleotide molecules that encode additional amino acid residues, such as serine, at the location 
of the insertions in the subject revertants. Such polynucleotide molecules are included within the 
scope of the subject invention. 

It should be understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be suggested
to persons skilled in the art and are to be included within the scope and purview of this application
and the scope of the appended claims.

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7745 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TAAGAGGGGT GCACCTAGCA TAGATTTTTT GGGCTCCCTG GCCTCTCCTT TCTTCCGCCT 60 

GAAAACAACC TACATGGATA CATCTGCAAC CAGAGGGAGT ATCTGATGCT TTTTCCTGGG 120 

CAGGGAGAGC TATGAGACGT ATGTCCTCAA AGCCACTTTG CATTGTGTGA AACCAATATC 180 

GATCTTTGTT ACTTCATCAT GCATGAACAT TTGTGGAAAC TACTAGCTTA CAAGCATTAG 240 

TGACAGCTCA GAAAAAAGTT ATCTCTGAAA GGTTTCATGT GTACCGTGGG AAATGAGAAA 300 

TGTTGCCAAC TCAAACACCT TCAATATGTT GTTTGCAGGC AAACTCTTCT GGAAGAAAGG 360 

TGTCTAAAAC TATGAACGGG TTACAGAAAG GTATAAACCA CGGCTGTGCA TTTTGGAAGT 420 

ATCATCTATA GATGTCTGTT GAGGGGAAAG CCGTACGCCA ACGTTATTTA CTCAGAAACA 480

GCTTCAACAC ACAGTTGTCT GCTTTATGAT GGCATCTCCA CCCAGGCACC CACCATCACC 540 

TATTCACCTA TCTCTCGTGC CTGTTTATTT TCTTGCCCTT TCTGATCATA AAAAATCATT 600 

AAGAGTTTGC AAACATGCAT AGGCATATCA ATATGCTCAT TTATTAATTT GCTAGCAGAT 660 

CATCTTCCTA CTCTTTACTT TATTTATTGT TTGAAAAATA TGTCCTGCAC CTAGGGAGCT 720 

CGTATACAGT ACCAATGCAT CTTCATTAAA TGTGAATTTC AGAAAGGAAG TAGGAACCTA 780

TGAGAGTATT TTTCAAAATT AATTAGCGGC TTCTATTATG TTTATAGCAA AGGCCAAGGG 840 

CAAAATCGGA ACACTAATGA TGGTTGGTTG CATGAGTCTG TCGATTACTT GCAAGAAATG 900 

TGAACCTTTG TTTCTGTGCG TGGGCATAAA ACAAACAGCT TCTAGCCTCT TTTACGGTAC 960 

TTGCACTTGC AAGAAATGTG AACTCCTTTT CATTTCTGTA TGTGGACATA ATGCCAAAGC 1020 

ATCCAGGCTT TTTCATGGTT GTTGATGTCT TTACACAGTT CATCTCCACC AGTATGCCCT 1080 

CCTCATACTC TATATAAACA CATCAACAGC ATCGCAATTA GCCACAAGAT CACTTCGGGA 1140 

GGCAAGTGTG ATTTCGACCT TGCAGCCACC TTTTTTTGTT CTGTTGTAAG TATACTTTCC 1200 

CTTACCATCT TTATCTGTTA GTTTAATTTG TAATTGGGAA GTATTAGTGG AAAGAGGATG 1260 

AGATGCTATC ATCTATGTAC TCTGCAAATG CATCTGACGT TATATGGGCT GCTTCATATA 1320

ATTTGAATTG CTCCATTCTT GCCGACAATA TATTGCAAGG TATATGCCTA GTTCCATCAA 1380
AAGTTCTGTT TTTTCATTCT AAAAGCATTT TAGTGGCACG CAATTTTGTC CATGAGGGAA 1440
AGGAAATCTG TTTTGGTTAC TTTGCTTGAG GTGCATTCTT CATATGTCCA GTTTTATGGA 1500
AGTAATAAAC TTCAGTTTGG TCATAAGATG TCATATTAAA GGGCAAACAT ATATTCAATG 1560
TTCAATTCAT CGTAAATGTT CCCTTTTTGT AAAAGATTGC ATACTCATTT ATTTGAGTTG 1620
CAGGTGTATC TAGTAGTTGG AGGAGATATG CAGTTTGCAC TTGCATTGGA CACGAACTCA 1680
GGTCCTCACC AGATAAGATC TTGTGAGGGT GATGGGATTG ACAGGTTGGA AAAATTAAGT 1740
ATTGGGGGCA GAAAGCAGGA GAAAGCTTTG AGAAATAGGT GCTTTGGTGG TAGAGTTGCT 1800
GCAACTACAC AATGTATTCT TACCTCAGAT GCTTGTCCTG AAACTCTTGT AAGTATCCAC 1860
CTCAATTATT ACTCTTACAT GTTGGTTTAC TTTACGTTTG TCTTTTCAAG GGAAATTTAC 1920
TGTATTTTTT GTGTTTTGTG GGAGTTCTAT ACTTCTGTTG GACTGGTTAT TGTAAAGATT 1980
TGTTCAAATA GGGTCATCTA ATAATTGTTT GAAATCTGGG AACTGTGGTT TCACTGCGTT 2040
CAGGAAAAAG TGAATTATTG GTTACTGCAT GAATAACTTA TGGAAATAGA CCTTAGAGTT 2100
GCTGCATGAT TATCACAAAT CATTGCTACG ATATCTTATA ATAGTTCTTT CGACCTCGCA 2160
TTACATATAT AACTGCAACT CCTAGTTGCG TTCAAAAAAA AAAATGCAAC TCTTAGAACG 2220
CTCACCAGTG TAATCTTTCC TGAATTGTTA TTTAATGGCA TGTATGCACT ACTTGTATAC 2280
TTATCTAGGA TTAAGTAATC TAACTCTAGG CCCCATATTT GCAGCATTCT CAAACACAGT 2340
CCTCTAGGAA AAATTATGCT GATGCAAACC GTGTATCTGC TATCATTTTG GGCGGAGGCA 2400
CTGGATCTCA GCTCTTTCCT CTGACAAGCA CAAGAGCTAC GCCTGCTGTA AGGGATAACA 2460
CTGAACATCC AACGTTGATT ACTCTATTAT AGTATTATAC AGACTGTACT TTTCGAATTT 2520
ATCTTAGTTT TCTACAATAT TTAGTGGATT CTTCTCATTT TCAAGATACA CAATTGATCC 2580
ATAATCGAAG TGGTATGTAA GACAGTGAGT TAAAAGATTA TATTTTTTGG GAGACTTCCA 2640
GTCAAATTTT CTTAGAAGTT TTTTTGGTCC AGATGTTCAT AAAGTCGCCG CTTTCATACT 2700
TTTTTTAATT TTTTAATTGG TGCACTATTA GGTACCTGTT GGAGGATGTT ACAGGCTTAT 2760
TGATATCCCT ATGAGTAACT GCTTCAACAG TGGTATAAAT AAGATATTTG TGATGAGTCA 2820
GTTCAATTCT ACTTCGCTTA ACCGCCATAT TCATCGTACA TACCTTGAAG GCGGGATCAA 2880
CTTTGCTGAT GGATCTGTAC AGGTGATTTA CCTCATCTTG TTGATGTGTA ATACTGTAAT 2940
TAGGAGTAGA TTTGTGTGGA GAGAATAATA AACAGATGCC GAGATTCTTT TCTAAAAGTC 3000
TAGATCCAAA GGCATTGTGG TTCAAAACAC TATGGACTTC TACCATTTAT GTCATTACTT 3060
TGCCTTAATG TTCCATTGAA TGGGGCAAAT TATTGATTCT ACAAGTGTTT AATTAAAAAC 3120

TAATTGTTCA TCCTGCAGGT ATTAGCGGCT ACACAAATGC CTGAAGAGCC AGCTGGATGG 3180 

TTCCAGGGTA CAGCAGACTC TATCAGAAAA TTTATCTGGG TACTCGAGGT AGTTGATATT 3240 

TTCTCGTTTA TGAATGTCCA TTCACTCATT CCTGTAGCAT TGTTTCTTTG TAATTTTGAG 3300

TTCTCCTGTA TTTCTTTAGG ATTATTACAG TCACAAATCC ATTGACAACA TTGTAATCTT 3360

GAGTGGCGAT CAGCTTTATC GGATGAATTA CATGGAACTT GTGCAGGTAT GGTGTTCTCT 3420

TGTTCCTCAT GTTTCACGTA ATGTCCTGAT TTTGGATTAA CCAACTACTT TTGGCATGCA 3480

TTATTTCCAG AAACATGTCG AGGACGATGC TGATATCACT ATATCATGTG CTCCTGTTGA 3540 

TGAGAGGTAA TCAGTTGTTT ATATCATCCT AATATGAATA TGTCATCTTG TTATCCAACA 3600 

CAGGATGCAT ATGGTCTAAT CTGCTTTCCT TTTTTTTCCC TTCGGAAGCC GAGCTTCTAA 3660 

AAATGGGCTA GTGAAGATTG ATCATACTGG ACGTGTACTT CAATTCTTTG AAAAACCAAA 3720 

GGGTGCTGAT TTGAATTCTA TGGTTAGAAA TTCCTTGTGT AATCCAATTC TTTTGTTTTC 3780

CTTTCTTTCT TGAGATGAAC CCCTCTTTTA GTTATTTCCA TGGATAACCT GTACTTGACT 3840 

TATTCAGAAA TGATTTTCTA TTTTGCTGTA GAATCTGACA CTAAAGCTAA TAGCACTGAT 3900 

GTTGCAGAGA GTTGAGACCA ACTTCCTGAG CTATGCTATA GATGATGCAC AGAAATATCC 3960

ATACCTTGCA TCAATGGGCA TTTATGTCTT CAAGAAAGAT GCACTTTTAG ACCTTCTCAA 4020

GTAATCACTT TCCTGTGACT TATTTCTATC CAACTCCTAG TTTACCTTCT AACAGTGTCA 4080

ATTCTTAGGT CAAAATATAC TCAATTACAT GACTTTGGAT CTGAAATCCT CCCAAGAGCT 4140

GTACTAGATC ATAGTGTGCA GGTAAGTCTG ATCTGTCTGG AGTATGTGTT CTGTAAACTG 4200 

TAAATTCTTC ATGTCAAAAA GTTGTTTTTG TTTCCAGTTT CCACTACCAA TGCACGATTT 4260 

ATGTATTTTC GCTTCCATGC ATCATACATA CTAACAATAC ATTTTACGTA TTGTGTTAGG 4320 

CATGCATTTT TACGGGCTAT TGGGAGGATG TTGGAACAAT CAAATCATTC TTTGATGCAA 4380

ACTTGGCCCT CACTGAGCAG GTACTCTGTC ATGTATTCTG TACTGCATAT ATATTACCTG 4440 

GAATTCAATG CATAGAATGT GTTAGACCAT CTTAGTTCCA TCCTGTTTTC TTCAATTAGC 4500 

TTATCATTTA ATAGTTGTTG GCTAGAATTT AAACACAAAT TTACCTAATA TGTTTCTCTC 4560 

TTCAGCCTTC CAAGTTTGAT TTTTACGATC CAAAAACACC TTTCTTCACT GCACCCCGAT 4620

GCTTGCCTCC GACGCAATTG GACAAGTGCA AGGTATATGT CTTACTGAGC ACAATTGTTA 4680

CCTCAGCAAG ATTTTGTGTA CTTGACTTGT TCTCCTCCAC AGATGAAATA TGCATTTATC 4740

TCAGATGGTT GCTTACTGAG AGAATGCAAC ATCGAGCATT CTGTGATTGG AGTCTGCTCA 4800

CGTGTCAGCT CTGGATGTGA ACTCAAGGTA CATACTCTGC CAATGTATCT ACTCTTGAGT 4860

ATACCATTTC AACACCAAGC ATCACCAAAT CACACAGAAC AATAGCAACA AAGCCTTTTA 4920 

GTTCCAAGCA ATTTAGGGTA GCCTAGAGTT GAAATCTAAC AAAACAAAAG TCAAAGCTCT 4980 

ATCACGTGGA TAGTTGTTTT CCATGCACTC TTATTTAAGC TAATTTTTTG GGTATACTAC 5040

ATCCATTTAA TTATTGTTTT ATTGCTTCTT CCCTTTGCCT TTCCCCCATT ACTATCGCGT 5100

CTTAAGATCA TACTACGCAC TAGTGTCTTT AGAGGTCTCT GGTGGACATG TTCAAACCAT 5160 

CTCAATCGGT GTTGGACAAG TTTTTCTTGA ATTTGTGCTA CACCTAACCT ATCACGTATG 5220 

TCATCGTTTC AAACTCGATC CTTCCTGTAT CATCATAAAT CCAATGCAAC ATACGCATTT 5280

ATGCAACATT TATCTGTTGA ACATGTCATC TTTTTGTAGG TTAACATTAT GCACCATACA 5340

ATGTAGCATG TCTAATCATC ATCCTATAAA ATTTACATTT TAGCTTATGT GGTATCCTCT 5400

TGCCACTTAG AACACCATAT GCTTGATGCC ATTTCATCCA CCCTGCTTTG ATTCTATGGC 5460

TAACATCTTC ATTAATATCC TCGCCTCTCT GTATCATTGG TCCTAAATAT GGAAATACAT 5520 

TCTTTCTGGG CACTACTTGA CCTTCCAAAC TAACGTCTCC TTTGCTCCTT TCTTGTGTGT 5580 

AGTAGTACCG AAGTCACATC TCATATATTC GGTTTTAGTT CTACTAAGTC CCGGGTTCGA 5640 

TCCCCCTCAG GGGTGAATTT CGGGCTTGGT AAAAAAAATC CCCTCGCTGT GTCCCGCCCG 5700 

CTCTCGGGGA TCGATATCCT GCGCGCCACC CTCCGGCTGG GCATTGCAGA GTGAGCAGTT 5760 

GATCGGCTCG TTAGTGATGG GGAGCGGGGT TCAAGGGTTT TCTCGGCCGG GACCATGTTT 5820

CGGTCTCTTA ATATAATGCC GGGAGGGCAG TCTTTCCCTC CCCGGTCGAG TTTTAGTTCT 5880

ACCGAGTCTA AAACCTTTGG ACTCTAGAGT CCCCTGTCAC AACTCACAAC TCTAGTTTTC 5940
TATTTACTTC TACCTAGCGT TTATTAATGA TCACTATATC GTCTGTAAAA AGCATACACC 6000
AATGTAATCC CCTTGTATGT CCCTTGTAAT ATTATCCATC ACAAGAAAAA AAGGTAAGGC 6060 
TCAAAGTTGA CTTTTGATAT AGTCCTATTC TAATCGAGAA GTCATCTGTA TCTTCGTCTC 6120 
TTGTTCGAAC ACTAGTCACA AAATTTTTTG TACATGTTCT TAATGAGTCC AACGTAATAT 6180 
TCCTTGATAT TTTGTCATAA GCCCTCATCA AGTCAATGAA AATCACGTGT AGGTCCTTCA 6240 
TTTGTTCCTT ATACTGCTCC ATCACTTGTC TC-.TTAAGAA AATCTCTCTC ATAGTTAACC 6300 
TTTTGGCATG AAACAAAATC ACACAGAAGT TGTTTCCTTT TTTTAAGATC CCACACAAAA 6360
GAGGTTTGAT CTAAGGAATC TGGATCCCTG ACAGGTTTAT CAAAATCCTT TGTGTTTTTC 6420
TTAAAACTGA ATATTCCTCC AGCTTCTAGT ATTGATGTAA TATTCAATCT GTTTAGCAAG 6480

TGAACACCTT GGTTCTTGTT GTTACTGTAC CCCCCCCCCC CCCCCCCCCC CGAGGCCCAG 6540 

ATTACCACGA CATGAATACA AGAATATTGA ACCCAGATCT AGAGTTTGTT TGTACTGTTG 6600 

AAAATCGGTG ACAATTCATT TTGTTATTGC GCTTTCTGAT AACGACAGGA CTCCGTGATG 6660 

ATGGGAGCGG ACACCTATGA AACTGAAGAA GAAGCTTCAA AGCTACTGTT AGCTGGGAAG 6720

GTCCCAGTTG GAATAGGAAG GAACACAAAG ATAAGGTGAG TATGGATGTG GAACCACCGG 6780

TTAGTTCCCA AAAATATCAC TCACTGATAC CTGATGGTAT CCTCTGATTA TTTTCAGGAA 6840

CTGTATCATT GACATGAATG CTAGGATTGG GAAGAACGTG GTGATCACAA ACAGTAAGGT 6900

GAGCGAGCGC ACCTACATGG GTGCAGAATC TTGTGTGCTC ATCTATCCTA ATTCGGTAAT 6960 

TCCTATCCAG CGCTAGTCTT GTGACCATGG GGCATGGGTT CGACTCTGTG ACAGGGCATC 7020 

CAAGAGGCTG ATCACCCGGA AGAAGGGTAC TCGTACTACA TAAGGTCTGG AATCGTGGTG 7080 

ATCTTGAAGA ATGCAACCAT CAACGATGGG TCTGTCATAT AGATCGGCTG CGTGTGCGTC 7140

TACAAAACAA GAACCTACAA TGGTATTGCA TCGATGGATC GTGTAACCTT GGTATGGTAA 7200 

GAGCCGCTTG ACAGAAAGTC GAGCGTTCGG GCAAGATGCG TAGTCTGGCA TGCTGTTCCT 7260 

TGACCATTTG TGCTGCTAGT ATGTACTGTT ATAAGCTGCC CTAGAAGTTG CAGCAAACCT 7320 

TTTTATGAAC CTTTGTATTT CCATTACCTG CTTTGGATCA ACTATATCTG TCATCCTATA 7380 

TATTACTAAA TTTTTACGTG TTTTTCTAAT TCGGTGCTGC TTTTGGGATC TGGCTTCGAT 7440 

GACCGCTCGA CCCTGGGCCA TTGGTTCAGC TCTGTTCCTT AGAGCAACTC CAAGGAGTCC 7500 

TAAATTTTGT ATTAGATACG AAGGACTTCA GCCGTGTATG TCGTCCTCAC CAAACGCTCT 7560 

TTTTGCATAG TGCAGGGGTT GTAGACTTGT AGCCCTTGTT TAAAGAGGAA TTTGAATATC 7620 

AAATTATAAG TATTAAATAT ATATTTAATT AGGTTAACAA ATTTGGCTCG TTTTTAGTCT 7680 

TTATTTATGT AATTAGTTTT AAAAATAGAC CTATATTTCA ATACGAAATA TCATTAACAT 7740 

CGATA 7745

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1919 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ACAAGATCAC TTCGGGAGGC AAGTGCGATT TTGATCTTGC AGCCACCTTT TTTTGTTCTG 60 

TTGTGTATCT AGTAGTTGGA GGAGATATGC AGTTTGCACT TGCATTGGAC ACGAACTCAG 120 

GTCCTCACCA GATAAGATCT TGTGAGGGTG ATGGGATTGA CAGGTTGGAA AAATTAAGTA 180 

TTGGGGGCAG AAAGCAGGAG AAAGCTTTGA GAAATAGGTG CTTTGGTGGT AGAGTTGCTG 240 

CAACTACACA ATGTATTCTT ACCTCAGATG CTTGTCCTGA AACTCTTCAT TCTCAAACAC 300 

AGTCCTCTAG GAAAAATTAT GCTGATGCAA ACCGTGTATC TGCGATCATT TTGGGCGGAG 360 

GCACTGGATC TCAGCTCTTT CCTCTGACAA GCACAAGAGC TACGCCTGCT GTACCTGTTG 420 

GAGGATGTTA CAGGCTTATT GATATCCCTA TGAGTAACTG CTTCAACAGT GGTATAAATA 480 

AGATATTTGT GATGAGTCAG TTCAATTCTA CTTCGCTTAA CCGCCATATT CATCGTACAT 540 

ACCTTGAAGG CGGGATCAAC TTTGCTGATG GATCTGTACA GGTATTAGCG GCTACACAAA 600

TGCCTGAAGA GCCAGCTGGA TGGTTCCAGG GTACAGCAGA CTCTATCAGA AAATTTATCT 660 

GGGTACTCGA GGATTATTAC AGTCACAAAT CCATTGACAA CATTGTAATC TTGAGTGGCG 720 

ATCAGCTTTA TCGGATGAAT TACATGGAAC TTGTGCAGAA ACATGTCGAG GACGATGCTG 780 

ATATCACTAT ATCATGTGCT CCTGTTGATG AGAGCCGAGC TTCTAAAAAT GGGCTAGTGA 840

AGATTGATCA TACTGGACGT GTACTTCAAT TCTTTGAAAA ACCAAAGGGT GCTGATTTGA 900 

ATTCTATGAG AGTTGAGACC AACTTCCTGA GCTATGCTAT AGATGATGCA CAGAAATATC 960 

CATACCTTGC ATCAATGGGC ATTTATGTCT TCAAGAAAGA TGCACTTTTA GACCTTCTCA 1020

AGTCAAAATA TACTCAATTA CATGACTTTG GATCTGAAAT CCTCCCAAGA GCTGTACTAG 1080

ATCATAGTGT GCAGGCATGC ATTTTTACGG GCTATTGGGA GGATGTTGGA ACAATCAAAT 1140 

CATTCTTTGA TGCAAACTTG GCCCTCACTG AGCAGCCTTC CAAGTTTGAT TTTTACGATC 1200

CAAAAACACC TTTCTTCACT GCACCCCGAT GCTTGCCTCC GACGCAATTG GACAAGTGCA 1260 

AGATGAAATA TGCATTTATC TCAGATGGTT GCTTACTGAG AGAATGCAAC ATCGAGCATT 1320

CTGTGATTGG AGTCTGCTCA CGTGTCAGCT CTGGATGTGA ACTCAAGGAC TCCGTGATGA 1380 

TGGGAGCGGA CATCTATGAA ACTGAAGAAG AAGCTTCAAA GCTACTGTTA GCTGGGAAGG 1440 

TCCCGATTGG AATAGGAAGG AACACAAAGA TAAGGAACTG TATCATTGAC ATGAATGCTA 1500

GGATTGGGAA GAACGTGGTG ATCACAAACA GTAAGGGCAT CCAAGAGGCT GATCACCCGG 1560

AAGAAGGGTA CTCGTACTAC ATAAGGTCTG GAATCGTGGT GATCCTGAAG AATGCAACCA 1620

TCAACGATGG GTCTGTCATA TAGATCGGCT GCGTTTGCGT CTACAAAACA AGAACCTACA 1680

ATGGTATTGC ATCGATGGAT CGTGTAACCT TGGTATGGTA AGAGCCGCTT GACAGGAAGT 1740 

CGAGCTTCGG GCGAAGATGC TAGTCTGGCA TGCTGTTCCT TGACCATTTG TGCTGCTAGT 1800 

ATGTACCTGT TATAAGCTGC CCTAGAAGTT GCAGCAAACC TTTTTATGAA CCTTTGTATT 1860 

TCCATTACCC TGCTTTGGAT CAACTATATC TGTCAGTCCT ATATATTACT AAATTTTTA 1919 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 518 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Gin Phe Ala Leu Ala Leu Asp Thr Asn Ser Gly Pro His Gin lie 
15 10 15 

Arg Ser Cys Glu Gly Asp Gly lie Asp Arg Leu Glu Lys Leu Ser lie 
20 25 30 

Gly Gly Arg Lys Gin Glu Lys Ala Leu Arg Asn Arg Cys Phe Gly Gly 
35 40 45 

Arg Val Ala Ala Thr Thr Gin Cys He Leu Thr Ser Asp Ala Cys Pro 
50 55 60 

Glu Thr Leu His Ser Gin Thr Gin ser Ser Arg Lys Asn Tyr Ala Asp 
65 70 75 80 

Ala Asn Arg Val Ser Ala He He Leu Gly Gly Gly Thr Gly ser Gin 
85 90 95 

Leu Phe Pro Leu Thr Ser Thr Arg Ala Thr Pro Ala Val Pro Val Gly 
100 105 HO 

Gly Cys Tyr Arg Leu He Asp He Pro Met Ser Asn cys Phe Asn Ser 
115 120 125 

Gly He Asn Lys He Phe Val Met Ser Gin Phe Asn Ser Thr Ser Leu 
130 135 140 

Asn Arg His He His Arg Thr Tyr Leu Glu Gly Gly lie Asn Phe Ala 
145 150 155 160 

Asp Gly ser Val Gin Val Leu Ala Ala Thr Gin Met Pro Glu Glu Pro 
165 170 175 

Ala Gly Trp Phe Gln Gly Thr Ala Asp Ser Ile Arg Lys Phe Ile Trp 
180 185 190 

Val Leu Glu Asp Tyr Tyr Ser His Lys Ser Ile Asp Asn Ile Val Ile 
195 200 205 

Leu Ser Gly Asp Gln Leu Tyr Arg Met Asn Tyr Met Glu Leu Val Gln 
210 215 220 

Lys His Val Glu Asp Asp Ala Asp Ile Thr Ile Ser Cys Ala Pro Val 
225 230 235 240 

Asp Glu Ser Arg Ala Ser Lys Asn Gly Leu Val Lys Ile Asp His Thr 
245 250 255 

Gly Arg Val Leu Gln Phe Phe Glu Lys Pro Lys Gly Ala Asp Leu Asn 
260 265 270 

Ser Met Arg Val Glu Thr Asn Phe Leu Ser Tyr Ala Ile Asp Asp Ala 
275 280 285 

Gln Lys Tyr Pro Tyr Leu Ala Ser Met Gly Ile Tyr Val Phe Lys Lys 
290 295 300 

Asp Ala Leu Leu Asp Leu Leu Lys Ser Lys Tyr Thr Gln Leu His Asp 
305 310 315 320 

Phe Gly Ser Glu Ile Leu Pro Arg Ala Val Leu Asp His Ser Val Gln 
325 330 335 

Ala Cys Ile Phe Thr Gly Tyr Trp Glu Asp Val Gly Thr Ile Lys Ser 
340 345 350 

Phe Phe Asp Ala Asn Leu Ala Leu Thr Glu Gln Pro Ser Lys Phe Asp 
355 360 365 

Phe Tyr Asp Pro Lys Thr Pro Phe Phe Thr Ala Pro Arg Cys Leu Pro 
370 375 380 

Pro Thr Gln Leu Asp Lys Cys Lys Met Lys Tyr Ala Phe Ile Ser Asp 
385 390 395 400 

Gly Cys Leu Leu Arg Glu Cys Asn Ile Glu His Ser Val Ile Gly Val 
405 410 415 

Cys Ser Arg Val Ser Ser Gly Cys Glu Leu Lys Asp Ser Val Met Met 
420 425 430 

Gly Ala Asp Ile Tyr Glu Thr Glu Glu Glu Ala Ser Lys Leu Leu Leu 
435 440 445 

Ala Gly Lys Val Pro Ile Gly Ile Gly Arg Asn Thr Lys Ile Arg Asn 
450 455 460 

Cys Ile Ile Asp Met Asn Ala Arg Ile Gly Lys Asn Val Val Ile Thr 
465 470 475 480 

Asn Ser Lys Gly Ile Gln Glu Ala Asp His Pro Glu Glu Gly Tyr Ser 
485 490 495 

Tyr Tyr Ile Arg Ser Gly Ile Val Val Ile Leu Lys Asn Ala Thr Ile 
500 505 510 

Asn Asp Gly Ser Val Ile 
515 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1551 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ATGCAGTTTG CACTTGCATT GGACACGAAC TCAGGTCCTC ACCAGATAAG ATCTTGTGAG 60 

GGTGATGGGA TTGACAGGTT GGAAAAATTA AGTATTGGGG GCAGAAAGCA GGAGAAAGCT 120 

TTGAGAAATA GGTGCTTTGG TGGTAGAGTT GCTGCAACTA CACAATGTAT TCTTACCTCA 180 

GATGCTTGTC CTGAAACTCT TCATTCTCAA ACACAGTCCT CTAGGAAAAA TTATGCTGAT 240 

GCAAACCGTG TATCTGCGAT CATTTTGGGC GGAGGCACTG GATCTCAGCT CTTTCCTCTG 300 

ACAAGCACAA GAGCTACGCC TGCTGTACCT GTTGGAGGAT GTTACAGGCT TATTGATATC 360 

CCTATGAGTA ACTGCTTCAA CAGTGGTATA AATAAGATAT TTGTGATGAG TCAGTTCAAT 420 

TCTACTTCGC TTAACCGCCA TATTCATCGT ACATACCTTG AAGGCGGGAT CAACTTTGCT 480 

GATGGATCTG TACAGGTATT AGCGGCTACA CAAATGCCTG AAGAGCCAGC TGGATGGTTC 540 

CAGGGTACAG CAGACTCTAT CAGAAAATTT ATCTGGGTAC TCGAGGATTA TTACAGTCAC 600 

AAATCCATTG ACAACATTGT AATCTTGAGT GGCGATCAGC TTTATCGGAT GAATTACATG 660 

GAACTTGTGC AGAAACATGT CGAGGACGAT GCTGATATCA CTATATCATG TGCTCCTGTT 720 

GATGAGAGCC GAGCTTCTAA AAATGGGCTA GTGAAGATTG ATCATACTGG ACGTGTACTT 780 

CAATTCTTTG AAAAACCAAA GGGTGCTGAT TTGAATTCTA TGAGAGTTGA GACCAACTTC 840 

CTGAGCTATG CTATAGATGA TGCACAGAAA TATCCATACC TTGCATCAAT GGGCATTTAT 900 

GTCTTCAAGA AAGATGCACT TTTAGACCTT CTCAAGTCAA AATATACTCA ATTACATGAC 960 

TTTGGATCTG AAATCCTCCC AAGAGCTGTA CTAGATCATA GTGTGCAGGC ATGCATTTTT 1020 

ACGGGCTATT GGGAGGATGT TGGAACAATC AAATCATTCT TTGATGCAAA CTTGGCCCTC 1080 

ACTGAGCAGC CTTCCAAGTT TGATTTTTAC GATCCAAAAA CACCTTTCTT CACTGCACCC 1140 

CGATGCTTGC CTCCGACGCA ATTGGACAAG TGCAAGATGA AATATGCATT TATCTCAGAT 1200 

GGTTGCTTAC TGAGAGAATG CAACATCGAG CATTCTGTGA TTGGAGTCTG CTCACGTGTC 1260 

AGCTCTGGAT GTGAACTCAA GGACTCCGTG ATGATGGGAG CGGACATCTA TGAAACTGAA 1320 

GAAGAAGCTT CAAAGCTACT GTTAGCTGGG AAGGTCCCGA TTGGAATAGG AAGGAACACA 1380 

AAGATAAGGA ACTGTATCAT TGACATGAAT GCTAGGATTG GGAAGAACGT GGTGATCACA 1440 

AACAGTAAGG GCATCCAAGA GGCTGATCAC CCGGAAGAAG GGTCCTACTA CATAAGGTCT 1500 

GGAATCGTGG TGATCCTGAA GAATGCAACC ATCAACGATG GGTCTGTCAT A 1551 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 517 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Gln Phe Ala Leu Ala Leu Asp Thr Asn Ser Gly Pro His Gln Ile 
1 5 10 15 

Arg Ser Cys Glu Gly Asp Gly Ile Asp Arg Leu Glu Lys Leu Ser Ile 
20 25 30 

Gly Gly Arg Lys Gln Glu Lys Ala Leu Arg Asn Arg Cys Phe Gly Gly 
35 40 45 

Arg Val Ala Ala Thr Thr Gln Cys Ile Leu Thr Ser Asp Ala Cys Pro 
50 55 60 

Glu Thr Leu His Ser Gln Thr Gln Ser Ser Arg Lys Asn Tyr Ala Asp 
65 70 75 80 

Ala Asn Arg Val Ser Ala Ile Ile Leu Gly Gly Gly Thr Gly Ser Gln 
85 90 95 

Leu Phe Pro Leu Thr Ser Thr Arg Ala Thr Pro Ala Val Pro Val Gly 
100 105 110 

Gly Cys Tyr Arg Leu Ile Asp Ile Pro Met Ser Asn Cys Phe Asn Ser 
115 120 125 

Gly Ile Asn Lys Ile Phe Val Met Ser Gln Phe Asn Ser Thr Ser Leu 
130 135 140 

Asn Arg His Ile His Arg Thr Tyr Leu Glu Gly Gly Ile Asn Phe Ala 
145 150 155 160 

Asp Gly Ser Val Gln Val Leu Ala Ala Thr Gln Met Pro Glu Glu Pro 
165 170 175 

Ala Gly Trp Phe Gln Gly Thr Ala Asp Ser Ile Arg Lys Phe Ile Trp 
180 185 190 

Val Leu Glu Asp Tyr Tyr Ser His Lys Ser Ile Asp Asn Ile Val Ile 
195 200 205 

Leu Ser Gly Asp Gln Leu Tyr Arg Met Asn Tyr Met Glu Leu Val Gln 
210 215 220 

Lys His Val Glu Asp Asp Ala Asp Ile Thr Ile Ser Cys Ala Pro Val 
225 230 235 240 

Asp Glu Ser Arg Ala Ser Lys Asn Gly Leu Val Lys Ile Asp His Thr 
245 250 255 

Gly Arg Val Leu Gln Phe Phe Glu Lys Pro Lys Gly Ala Asp Leu Asn 
260 265 270 

Ser Met Arg Val Glu Thr Asn Phe Leu Ser Tyr Ala Ile Asp Asp Ala 
275 280 285 

Gln Lys Tyr Pro Tyr Leu Ala Ser Met Gly Ile Tyr Val Phe Lys Lys 
290 295 300 

Asp Ala Leu Leu Asp Leu Leu Lys Ser Lys Tyr Thr Gln Leu His Asp 
305 310 315 320 

Phe Gly Ser Glu Ile Leu Pro Arg Ala Val Leu Asp His Ser Val Gln 
325 330 335 

Ala Cys Ile Phe Thr Gly Tyr Trp Glu Asp Val Gly Thr Ile Lys Ser 
340 345 350 

Phe Phe Asp Ala Asn Leu Ala Leu Thr Glu Gln Pro Ser Lys Phe Asp 
355 360 365 

Phe Tyr Asp Pro Lys Thr Pro Phe Phe Thr Ala Pro Arg Cys Leu Pro 
370 375 380 

Pro Thr Gln Leu Asp Lys Cys Lys Met Lys Tyr Ala Phe Ile Ser Asp 
385 390 395 400 

Gly Cys Leu Leu Arg Glu Cys Asn Ile Glu His Ser Val Ile Gly Val 
405 410 415 

Cys Ser Arg Val Ser Ser Gly Cys Glu Leu Lys Asp Ser Val Met Met 
420 425 430 

Gly Ala Asp Ile Tyr Glu Thr Glu Glu Glu Ala Ser Lys Leu Leu Leu 
435 440 445 

Ala Gly Lys Val Pro Ile Gly Ile Gly Arg Asn Thr Lys Ile Arg Asn 
450 455 460 

Cys Ile Ile Asp Met Asn Ala Arg Ile Gly Lys Asn Val Val Ile Thr 
465 470 475 480 

Asn Ser Lys Gly Ile Gln Glu Ala Asp His Pro Glu Glu Gly Ser Tyr 
485 490 495 

Tyr Ile Arg Ser Gly Ile Val Val Ile Leu Lys Asn Ala Thr Ile Asn 
500 505 510 

Asp Gly Ser Val Ile 
515 
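
As an illustrative consistency check on the listing (not part of the original disclosure), the relationship between the cDNA of SEQ ID NO: 4 and the protein of SEQ ID NO: 5 can be sketched in Python. The sketch assumes the standard genetic code and a reading frame that begins at nucleotide 1, which is consistent with the stated lengths (1551 base pairs, 517 amino acids). The names `translate`, `SEQ4_NT_1441_1500` and `SEQ5_AA_481_500` are hypothetical, and the one-letter string is simply a transcription of residues 481 to 500 of the three-letter listing.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Nucleotides 1441-1500 of SEQ ID NO: 4, copied from the listing.
# Nucleotide 1441 starts codon 481 because 1440 = 3 * 480.
SEQ4_NT_1441_1500 = (
    "AACAGTAAGG GCATCCAAGA GGCTGATCAC CCGGAAGAAG GGTCCTACTA CATAAGGTCT"
)

# Residues 481-500 of SEQ ID NO: 5 in one-letter code (assumed transcription
# of: Asn Ser Lys Gly Ile Gln Glu Ala Asp His Pro Glu Glu Gly Ser Tyr
#     Tyr Ile Arg Ser).
SEQ5_AA_481_500 = "NSKGIQEADHPEEGSYYIRS"

if __name__ == "__main__":
    # The stated lengths agree: 517 codons with no stop codon included.
    print(1551 == 3 * 517)
    # The translated fragment matches the listed residues, including the
    # serine at position 495.
    print(translate(SEQ4_NT_1441_1500) == SEQ5_AA_481_500)
```

The same function could be applied to any other in-frame row of the listing to spot transcription errors in either the nucleotide or the amino acid sequence.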

Claims 



1. A polynucleotide molecule, comprising a variant of the wild type shrunken-2 (Sh2) gene, wherein said variant codes for the insertion of at least one additional amino acid within or close to the allosteric binding site of the ADP-glucose pyrophosphorylase (AGP) enzyme subunit, whereby a plant expressing said polynucleotide molecule has increased seed weight relative to the seed weight of a plant expressing the wild type Sh2 gene. 

2. The polynucleotide molecule, according to claim 1, wherein said polynucleotide molecule encodes at least one serine residue inserted between amino acids 494 and 495 of the native AGP enzyme subunit. 

3. The polynucleotide molecule, according to claim 1, wherein said polynucleotide molecule encodes the amino acid pair tyrosine:serine, wherein said amino acid pair is inserted between amino acids 494 and 495 of the native AGP enzyme subunit. 

4. The polynucleotide molecule, according to claim 1, wherein said polynucleotide molecule encodes the amino acid pair serine:tyrosine, wherein said amino acid pair is inserted between amino acids 495 and 496 of the native AGP enzyme subunit. 

5. The polynucleotide molecule, according to claim 1, wherein the AGP enzyme encoded by said polynucleotide molecule consists essentially of an amino acid sequence selected from the group consisting of SEQ ID NO. 5 and SEQ ID NO. 3. 

6. The polynucleotide molecule, according to claim 5, wherein the nucleotide sequence encoding SEQ ID NO. 3 comprises nucleotides 87 through 1640 of the sequence shown in SEQ ID NO. 2 or a degenerate fragment thereof. 

7. A method for increasing the seed weight of a plant, comprising incorporating the polynucleotide molecule of claim 1 into the genome of said plant and expressing the protein encoded by said polynucleotide molecule. 

8. The method, according to claim 7, wherein said plant is Zea mays. 



WO 98/10082 



PCT/US96/14244 




9. A plant seed comprising the polynucleotide molecule of claim 1 within the genome of said seed. 

10. A plant expressing the polynucleotide molecule of claim 1. 

11. The plant, according to claim 10, wherein said plant is Zea mays. 

12. The plant, according to claim 10, wherein said plant is grown from the seed of claim 9. 

13. A variant ADP-glucose pyrophosphorylase (AGP) protein, wherein said protein has at least one additional amino acid inserted within or close to the allosteric binding site of the wild-type AGP protein. 

14. The variant AGP protein, according to claim 13, wherein said protein has at least one serine residue inserted between amino acids 494 and 495 of the wild type AGP protein sequence. 

15. The variant AGP protein, according to claim 11, wherein said protein has the amino acid pair tyrosine:serine inserted between amino acids 494 and 495 of the wild-type AGP protein sequence. 

16. The variant AGP protein, according to claim 11, wherein said protein has the amino acid pair serine:tyrosine inserted between amino acids 495 and 496 of the wild-type AGP protein sequence. 

17. The variant AGP protein, according to claim 13, wherein said protein consists essentially of an amino acid sequence selected from the group consisting of SEQ ID NO. 5 and SEQ ID NO. 3. 

18. The variant AGP protein, according to claim 13, wherein said protein is expressed in the endosperm of a plant during seed development. 